Attenuated Signal Test Example

This notebook demonstrates an example of how an attenuation test can be used to detect biofouling on an instrument.

The source data is derived from a historical USGS CTD station located at Lower Sand Island, OR in the Columbia River estuary. The selected time period shows the tidal influence on salinity over a spring-neap time period. Near the end of the selected period, there is a decrease in the range of salinity corresponding with biofouling.

The data was downloaded from the Center for Coastal Margin and Prediction (CMOP) Data Explorer.

Setup

[1]:
# Setup directories
from pathlib import Path
basedir = Path().absolute()
libdir = basedir.parent.parent.parent

# Other imports
import pandas as pd
import numpy as np
from datetime import datetime

import matplotlib.pyplot as plt

from bokeh.layouts import gridplot
from bokeh import plotting
plotting.output_notebook()
Loading BokehJS ...
[2]:
# Install QC library
!pip install git+git://github.com/ioos/ioos_qc.git

# # Alternative installation (install specific branch):
# !pip uninstall -y ioos_qc
# !pip install git+git://github.com/ioos/ioos_qc.git@attenuated_signal_updates

# # Alternative installation (run with local updates):
# !pip uninstall -y ioos_qc
# import sys
# sys.path.append(str(libdir))

from ioos_qc.config import QcConfig
from ioos_qc import qartod
Collecting git+git://github.com/ioos/ioos_qc.git
  Cloning git://github.com/ioos/ioos_qc.git to /tmp/pip-req-build-unw_ljct
  Running command git clone -q git://github.com/ioos/ioos_qc.git /tmp/pip-req-build-unw_ljct
Requirement already satisfied (use --upgrade to upgrade): ioos-qc==1.0.0 from git+git://github.com/ioos/ioos_qc.git in /home/travis/miniconda/envs/test-environment/lib/python3.8/site-packages
Requirement already satisfied: geojson in /home/travis/miniconda/envs/test-environment/lib/python3.8/site-packages (from ioos-qc==1.0.0) (2.5.0)
Requirement already satisfied: netCDF4 in /home/travis/miniconda/envs/test-environment/lib/python3.8/site-packages (from ioos-qc==1.0.0) (1.5.3)
Requirement already satisfied: numpy>=1.14 in /home/travis/miniconda/envs/test-environment/lib/python3.8/site-packages (from ioos-qc==1.0.0) (1.18.1)
Requirement already satisfied: pygc in /home/travis/miniconda/envs/test-environment/lib/python3.8/site-packages (from ioos-qc==1.0.0) (1.2.1)
Requirement already satisfied: ruamel.yaml in /home/travis/miniconda/envs/test-environment/lib/python3.8/site-packages (from ioos-qc==1.0.0) (0.16.6)
Requirement already satisfied: simplejson in /home/travis/miniconda/envs/test-environment/lib/python3.8/site-packages (from ioos-qc==1.0.0) (3.17.0)
Requirement already satisfied: xarray in /home/travis/miniconda/envs/test-environment/lib/python3.8/site-packages (from ioos-qc==1.0.0) (0.15.1)
Requirement already satisfied: cftime in /home/travis/miniconda/envs/test-environment/lib/python3.8/site-packages (from netCDF4->ioos-qc==1.0.0) (1.1.1.2)
Requirement already satisfied: pandas>=0.25 in /home/travis/miniconda/envs/test-environment/lib/python3.8/site-packages (from xarray->ioos-qc==1.0.0) (1.0.3)
Requirement already satisfied: setuptools>=41.2 in /home/travis/miniconda/envs/test-environment/lib/python3.8/site-packages (from xarray->ioos-qc==1.0.0) (46.1.3.post20200325)
Requirement already satisfied: pytz>=2017.2 in /home/travis/miniconda/envs/test-environment/lib/python3.8/site-packages (from pandas>=0.25->xarray->ioos-qc==1.0.0) (2019.3)
Requirement already satisfied: python-dateutil>=2.6.1 in /home/travis/miniconda/envs/test-environment/lib/python3.8/site-packages (from pandas>=0.25->xarray->ioos-qc==1.0.0) (2.8.1)
Requirement already satisfied: six>=1.5 in /home/travis/miniconda/envs/test-environment/lib/python3.8/site-packages (from python-dateutil>=2.6.1->pandas>=0.25->xarray->ioos-qc==1.0.0) (1.14.0)
Building wheels for collected packages: ioos-qc
  Building wheel for ioos-qc (setup.py) ... - done
  Created wheel for ioos-qc: filename=ioos_qc-1.0.0-py3-none-any.whl size=31802 sha256=b38ce8d30ae903ce08dc77800d13fb1255501b39cf526e30d4cb4fc69ddc0fb4
  Stored in directory: /tmp/pip-ephem-wheel-cache-9h3k9td9/wheels/33/ab/58/56ed337d0f6f3af441234c995292c97819c9695750f647bf25
Successfully built ioos-qc
[3]:
# Method to plot QC results using Bokeh
def plot_results(data, var_name, results, title, test_name):

    time = data.index
    obs = data[var_name]
    qc_test = results['qartod'][test_name]

    qc_pass = np.ma.masked_where(qc_test != 1, obs)
    qc_suspect = np.ma.masked_where(qc_test != 3, obs)
    qc_fail = np.ma.masked_where(qc_test != 4, obs)
    qc_notrun = np.ma.masked_where(qc_test != 2, obs)

    p1 = plotting.figure(x_axis_type="datetime", title=test_name + ' : ' + title)
    p1.grid.grid_line_alpha=0.3
    p1.xaxis.axis_label = 'Time'
    p1.yaxis.axis_label = 'Observation Value'

    p1.line(time, obs,  legend_label='obs', color='#A6CEE3')
    p1.circle(time, qc_notrun, size=2, legend_label='qc not run', color='gray', alpha=0.2)
    p1.circle(time, qc_pass, size=4, legend_label='qc pass', color='green', alpha=0.5)
    p1.circle(time, qc_suspect, size=4, legend_label='qc suspect', color='orange', alpha=0.7)
    p1.circle(time, qc_fail, size=6, legend_label='qc fail', color='red', alpha=1.0)
    p1.circle(time, qc_notrun, size=6, legend_label='qc not eval', color='gray', alpha=1.0)

    plotting.show(gridplot([[p1]], plot_width=800, plot_height=400))

## Run example

1. Load data and perform exploratory analysis

[4]:
# Historical salinity data from Lower Sand Island, Columbia River Estuary
# data: http://amb6400b.stccmop.org/ws/product/offeringplot_ctime.py?handlegaps=true&series=time,sandi.790.A.CTD.salt.PD0&width=8.54&height=2.92&starttime=2001-7-1 0:00&endtime=2001-09-5 23:59
# location same as sandi for CREOFS: https://tidesandcurrents.noaa.gov/ofs/ofs_station.shtml?stname=Lower%20Sand%20Island%20Light&ofs=cre&stnid=sandi&subdomain=ba
filename = basedir.joinpath('attenuated_salinity_example.csv')

data = pd.read_csv(filename, header=1, index_col='\'time PST\'', parse_dates=True)
data.plot()
[4]:
<matplotlib.axes._subplots.AxesSubplot at 0x7f4d755ea9d0>
../_images/examples_QartodTestExample_SalinityAttenuation_7_1.png
[5]:
var_name = data.columns[0]
data[var_name]
[5]:
'time PST'
2001-08-15 19:06:55    27.37
2001-08-15 19:07:55    27.03
2001-08-15 19:08:55    26.78
2001-08-15 19:09:55    26.73
2001-08-15 19:10:55    27.10
                       ...
2001-09-01 20:51:57    15.71
2001-09-01 20:52:57    15.48
2001-09-01 20:53:57    15.46
2001-09-01 20:54:57    15.56
2001-09-01 20:55:57    14.61
Name:  sandi.790.A.CTD [psu], Length: 24436, dtype: float64

2. Plot range and standard deviation of salinity over M2 moving window

[6]:
# lunar day (M2)
time_delta = 3600 * 24.8
print(f'window: {time_delta}')

# start QC after half a tidal day
min_period_secs = time_delta/2
# one obs per minute
min_period_obs = time_delta/2/60
# panadas uses phrase "min_periods" to indicate minimum number of observations in the window
# - ioos_qc uses the phrase "min_period" to indicate minimum number of seconds in the window
# Depending on your use case, you can define the period in number of observations or number of seconds
print(f'min_period (secs): {min_period_secs}')
print(f'min_period (num obs): {min_period_obs}')
window: 89280.0
min_period (secs): 44640.0
min_period (num obs): 744.0
[7]:
# range check
range_data = data.rolling(f'{time_delta}S', min_periods=int(min_period_obs)).apply(np.ptp, raw=True)
range_data.plot()
[7]:
<matplotlib.axes._subplots.AxesSubplot at 0x7f4d761a9400>
../_images/examples_QartodTestExample_SalinityAttenuation_11_1.png
[8]:
# check that min_{periods, obs} are NaN
# - note that N-1 are NaNs
range_data.isna().sum()
[8]:
 sandi.790.A.CTD [psu]    743
dtype: int64
[9]:
# stdev check
stdev_data = data.rolling(f'{time_delta}S', min_periods=int(min_period_obs)).apply(np.std, raw=True)
stdev_data.plot()
[9]:
<matplotlib.axes._subplots.AxesSubplot at 0x7f4d75fc9e80>
../_images/examples_QartodTestExample_SalinityAttenuation_13_1.png

3. Run QC and plot results

Test range

The beginning 743 points (min_period) are marked as “NOT EVALUATED” because there is not enough data yet to evaluate whether they are pass or fail.

The range of the signal falls so quickly that no points are marked as “SUSPECT”, but immediately change from “PASS” to “FAIL”.

[10]:
# QC configuration
# This configuration is used to call the corresponding method in the ioos_qc library
# See documentation for description of each test and its inputs:
#   https://ioos.github.io/ioos_qc/api/ioos_qc.html#module-ioos_qc.qartod
qc_config = {
    'qartod': {
        "attenuated_signal_test": {
            "suspect_threshold": 17.25,
            "fail_threshold": 15,
            "test_period": int(time_delta),
            "min_period": min_period_secs,
            "check_type": "range"
      }
    }
}
qc = QcConfig(qc_config)
qc_results = qc.run(
    inp=data[var_name],
    tinp=data.index.values
)

title = "Salinity [psu] : Lower Sand Island, OR"

p1 = plot_results(data, var_name, qc_results, title, 'attenuated_signal_test')
Test standard deviation

The “standard deviation” test picks up likely suspect data whereas the “range” test did not. The exemplifies the utility of using both tests in tandem.

[11]:
qc_config = {
    'qartod': {
        "attenuated_signal_test": {
            "suspect_threshold": 5,
            "fail_threshold": 4.5,
            "check_type": "std",
            "test_period": int(time_delta),
            "min_period": min_period_secs,
      }
    }
}
qc = QcConfig(qc_config)
qc_results = qc.run(
    inp=data[var_name],
    tinp=data.index.values
)

title = "Salinity [psu] : Lower Sand Island, OR"
plot_results(data, var_name, qc_results, title, 'attenuated_signal_test')

## Sensitivity Tests

1. Test results sensitivity to min_period

The following plots demonstrate the sensitivity of the test results in the beginning of a time series to the selection of min_period.

min_period: 0

No observations are marked suspect at the beginning of the time series, but the first 744 observations (the size of the rolling window) are labeled “FAILING”.

[12]:
qc_config = {
    'qartod': {
        "attenuated_signal_test": {
            "suspect_threshold": 5,
            "fail_threshold": 4.5,
            "check_type": "std",
            "test_period": int(time_delta),
            "min_period": 0
      }
    }
}
qc = QcConfig(qc_config)
qc_results = qc.run(
    inp=data[var_name],
    tinp=data.index.values
)

title = "Salinity [psu] : Lower Sand Island, OR"
plot_results(data, var_name, qc_results, title, 'attenuated_signal_test')

min_period: 10min

Only the first 10s observations are marked as NOT EVALUATED, but the remainder of the first 744 samples are labeled as “FAILING”.

[13]:
qc_config = {
    'qartod': {
        "attenuated_signal_test": {
            "suspect_threshold": 5,
            "fail_threshold": 4.5,
            "check_type": "std",
            "test_period": int(time_delta),
            "min_period": 10*60
      }
    }
}
qc = QcConfig(qc_config)
qc_results = qc.run(
    inp=data[var_name],
    tinp=data.index.values
)

title = "Salinity [psu] : Lower Sand Island, OR"
plot_results(data, var_name, qc_results, title, 'attenuated_signal_test')

min_period: 1h

[14]:
qc_config = {
    'qartod': {
        "attenuated_signal_test": {
            "suspect_threshold": 5,
            "fail_threshold": 4.5,
            "check_type": "std",
            "test_period": int(time_delta),
            "min_period": 60*60
      }
    }
}
qc = QcConfig(qc_config)
qc_results = qc.run(
    inp=data[var_name],
    tinp=data.index.values
)

title = "Salinity [psu] : Lower Sand Island, OR"
plot_results(data, var_name, qc_results, title, 'attenuated_signal_test')

min_period: test_period/2

The first 744 samples are marked “NOT EVALUATED”, but none are marked as “FAILING”.

[15]:
qc_config = {
    'qartod': {
        "attenuated_signal_test": {
            "suspect_threshold": 5,
            "fail_threshold": 4.5,
            "check_type": "std",
            "test_period": int(time_delta),
            "min_period": int(time_delta/2)
      }
    }
}
qc = QcConfig(qc_config)
qc_results = qc.run(
    inp=data[var_name],
    tinp=data.index.values
)

title = "Salinity [psu] : Lower Sand Island, OR"
plot_results(data, var_name, qc_results, title, 'attenuated_signal_test')

2. Test results sensitivity to suspect_threshold.

suspect_threshold: 7

[16]:
qc_config = {
    'qartod': {
        "attenuated_signal_test": {
            "suspect_threshold": 7,
            "fail_threshold": 4.5,
            "check_type": "std",
            "test_period": int(time_delta),
            "min_period": min_period_secs
      }
    }
}
qc = QcConfig(qc_config)
qc_results = qc.run(
    inp=data[var_name],
    tinp=data.index.values
)

title = "Salinity [psu] : Lower Sand Island, OR"
plot_results(data, var_name, qc_results, title, 'attenuated_signal_test')

suspect_threshold: 6

[17]:
qc_config = {
    'qartod': {
        "attenuated_signal_test": {
            "suspect_threshold": 6,
            "fail_threshold": 4.5,
            "check_type": "std",
            "test_period": int(time_delta),
            "min_period": min_period_secs
      }
    }
}
qc = QcConfig(qc_config)
qc_results = qc.run(
    inp=data[var_name],
    tinp=data.index.values
)

title = "Salinity [psu] : Lower Sand Island, OR"
plot_results(data, var_name, qc_results, title, 'attenuated_signal_test')

suspect_threshold: 5

[18]:
qc_config = {
    'qartod': {
        "attenuated_signal_test": {
            "suspect_threshold": 5,
            "fail_threshold": 4.5,
            "check_type": "std",
            "test_period": int(time_delta),
            "min_period": min_period_secs
      }
    }
}
qc = QcConfig(qc_config)
qc_results = qc.run(
    inp=data[var_name],
    tinp=data.index.values
)

title = "Salinity [psu] : Lower Sand Island, OR"
plot_results(data, var_name, qc_results, title, 'attenuated_signal_test')